Abstract

We propose three methods for forecasting a time series modeled using a
functional coefficient autoregressive (FCAR) model fit via spline-backfitted
local linear (SBLL) smoothing. The three methods are a “naive” plug-in
method, a bootstrap method, and a multistage method. We present asymptotic
results for the SBLL estimation method for FCAR models and show that the
estimators are oracally efficient. The three forecasting methods are
compared through simulation. We find that the naive method performs just as
well as the multistage method and even outperforms it in some situations.
We apply the naive and multistage methods to solar irradiance data and
compare forecasts based on our method to those of a linear AR model, the
model most commonly applied in the solar energy literature.

Keywords: Functional-coefficient autoregressive model, Spline smoothing,
Local linear estimation, Oracle smoothing, Bootstrap forecast, Multistage
forecast
1. Introduction

There are many useful time series models that lie between the class of
linear, fully parametric models and the class of nonlinear nonparametric
models. One such model is the functional coefficient autoregressive (FCAR)
model, defined as

$$X_t = \sum_{\alpha=1}^{p} m_\alpha(U_t)\, X_{t-\alpha} + \sigma(\mathbf{U}_t, \mathbf{X}_t)\,\varepsilon_t, \qquad (1)$$

where $p$ is a positive integer, $m_\alpha(U_t)$ is a measurable function of
$U_t$ for $\alpha = 1, \ldots, p$, $\sigma(\mathbf{U}_t, \mathbf{X}_t)$ is a
variance function dependent on $\mathbf{U}_t = (X_1, \ldots, X_{n-d})$ and
$\mathbf{X}_t = (X_1, \ldots, X_n)'$, and $\{\varepsilon_t\}$ is a sequence of
i.i.d. random variables with mean 0 and finite variance. Usually the variable
$U_t$ is taken to be a lagged value of the series; i.e., $U_t = X_{t-d}$,
where $d$ is a positive integer. Although the FCAR model imposes an
autoregressive structure, its flexibility lies in allowing the autoregressive
coefficients $m_\alpha$ to vary as a function of $U_t$. While reducing the
size of the class of nonlinear models, the class of FCAR models is broad
enough to include some common nonlinear time series models as specific
cases. Among these are the threshold autoregressive (TAR) model of Tong
(1983), the exponential autoregressive (EXPAR) model of Haggan and Ozaki
(1981), and the smooth transition autoregressive (STAR) model of Chan and
Tong (1986).


Chen and Tsay (1993) introduced the FCAR model and proposed a procedure for
building the model based on arranged local autoregression, which constructs
estimators based on an iterative recursive formula that resembles local
constant smoothing. They compared the proposed model building procedure to
threshold and linear time series models through multi-step forecasts. The
FCAR model performed much better than the other two models in terms of bias.
However, the FCAR model only performed better for short term forecasts in
terms of mean square error (MSE). For long term forecasts, the linear model
performed the best in MSE.


Cai, Fan and Yao (2000) used a local linear fitting method to estimate
$m_\alpha(\cdot)$ in (1). They used the method on simulated data from an
EXPAR model and assessed the fit by calculating the square root of the
average squared errors (RASE). The performance of the method was gauged by
comparing the RASE to the standard deviation of the time series. Their
results showed the local linear method provided an adequate fit of the model,
with the RASE well below the standard deviation of the series. Real data
examples were used to assess the post-sample forecasting performance of the
local linear method. The two examples were the Canadian lynx data set (Tong,
1990, p. 377) and Wolf's annual sunspot numbers data set (Tong, 1990,
p. 420). The local linear method was compared with the linear AR model, the
TAR model, and the arranged local regression procedure of Chen and Tsay
(1993) using a one-step ahead and an iterative two-step ahead forecast. In
terms of average absolute predictive errors, the local linear method had
much better performance than both the linear AR model and the TAR model in
the Canadian lynx example and performed just as well as the other two models
in the sunspot numbers example.


Huang and Shen (2004) proposed a global smoothing procedure based on
polynomial splines for estimating FCAR models. The authors note that the
spline method yields a fitted model with a parsimonious explicit expression,
which is an advantage over the local polynomial method. This feature allows
one to produce multi-step ahead forecasts conveniently. Additionally, their
spline method is less computationally intensive than the local polynomial
method.

Forecasting for FCAR models was discussed in Harvill and Ray (2005). They
compared three methods, the first of which is the naive forecast presented
in Fan and Yao (2003). Another method was the multistage method of Chen
(1996). This method was developed for a general nonlinear AR model and was
adapted for the FCAR model by Harvill and Ray (2005). The last method was
the bootstrapping method of Huang and Shen (2004). In that paper,
bootstrapped residuals are added to the forecasted values after fitting the
model with splines. Harvill and Ray (2005) used bootstrapped residuals after
fitting with the local linear method. A comparison of the three methods
showed that the bootstrap method outperforms the other two methods for
nonlinear forecasting and performs well for forecasting a linear process.

A recent development in estimating nonlinear time series data is the
spline-backfitted kernel (SBK) method of Wang and Yang (2007). This method
combines the computational speed of splines with the asymptotic properties
of kernel smoothing. To estimate a component function in the model, all
other component functions are “pre-estimated” with splines, and the
pre-estimates are then subtracted from the observed time series. This
difference is used as a set of pseudo-responses, to which kernel smoothing
is applied to estimate the function of interest. By constructing the
estimates in this way, the method does not suffer from the “curse of
dimensionality.” The SBK method is adapted for i.i.d. data in Wang and Yang
(2009), to generalized additive models in Liu, Yang and Härdle (2011), and
to partially linear additive models in Ma and Yang (2011). In Song and Yang
(2010), a spline-backfitted spline (SBS) procedure is proposed. Liu and Yang
(2010) develop the SBK method for additive coefficient models, which are
generalized forms of FCAR models.

The SBK method has not previously been used in forecasting algorithms, and
in particular not in forecasting with the FCAR model. In this paper, we
develop new forecasting algorithms that make use of the SBK method, and
compare the performance of the new method to that of the naive, the
bootstrap, and the multistage methods for the FCAR model.

In Section 2, the SBK method for estimating the functional coefficients is
given. Section 3 presents three forecasting methods that employ the SBK
estimates. In Section 4, simulation results are used to compare the
forecasting methods, and the methods are applied to solar irradiance data in
Section 5. We conclude with a discussion in Section 6.


2. Estimation of Functional Coefficients 

To estimate the functional coefficients $m_\alpha(U_t)$, $\alpha = 1, 2,
\ldots, p$, in (1), we use the oracle smoothing of Linton (1997) and Wang
and Yang (2007). For the remainder of this paper, we take the variable $U_t$
in equation (1) to be a lagged value of the series; that is, $U_t =
X_{t-d}$, where $d$ is a positive integer. Consider a fixed integer
$\gamma$, $1 \le \gamma \le p$. If the coefficient functions
$m_\alpha(X_{t-d})$, $\alpha = 1, \ldots, p$, $\alpha \ne \gamma$, are known
by “oracle,” then we can construct $\{X_{t-d}, X_{t-\gamma},
Y_{\gamma,t}\}_{t=p+1}^{n}$, where

$$Y_{\gamma,t} = m_\gamma(X_{t-d})\, X_{t-\gamma} + \sigma(\mathbf{X}_t)\,\varepsilon_t = X_t - \sum_{\alpha=1,\,\alpha \ne \gamma}^{p} m_\alpha(X_{t-d})\, X_{t-\alpha},$$

from which we can estimate the unknown $m_\gamma(X_{t-d})$. This oracle
smoother removes the “curse of dimensionality,” since there is only one
function being estimated. Clearly, the coefficient functions
$m_\alpha(X_{t-d})$, $\alpha = 1, \ldots, p$, $\alpha \ne \gamma$, are not
known and must be estimated. For additive models, Linton (1997) used
marginal integration kernel estimates to estimate the functions, and Wang
and Yang (2007) used an under-smoothed spline procedure. We now adapt the
procedure of Wang and Yang (2007) to estimate the FCAR model.


We first “pre-estimate” the coefficient functions with a constant spline
procedure. Let $U_t = X_{t-d}$ be distributed on the compact interval
$[a, b]$ and denote the knots as $a = \kappa_0 < \kappa_1 < \cdots <
\kappa_N < \kappa_{N+1} = b$, where the number of interior knots is
$N \sim \ln n$. The B-spline basis functions are determined on the $N + 1$
equally spaced intervals with length $(b - a)(N + 1)^{-1}$. The basis
functions are defined as

$$B_J(u) = \begin{cases} 1, & \kappa_J \le u < \kappa_{J+1}, \\ 0, & \text{otherwise}, \end{cases} \qquad J = 0, \ldots, N.$$


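The pre-estimates are defined as

$$\hat{m}_\alpha(u) = \sum_{J=0}^{N} \hat{\lambda}_{(N+1)(\alpha-1)+J}\, B_J(u), \qquad \alpha = 1, \ldots, p,$$

where the coefficients $(\hat{\lambda}_0, \ldots,
\hat{\lambda}_{p(N+1)-1})'$ are solutions to the least squares problem

$$\left(\hat{\lambda}_0, \ldots, \hat{\lambda}_{p(N+1)-1}\right)' = \operatorname*{argmin}_{\mathbb{R}^{p(N+1)}} \sum_{t=1}^{n} \left\{ X_t - \sum_{\alpha=1}^{p} \sum_{J=0}^{N} \lambda_{(N+1)(\alpha-1)+J}\, B_J(U_t)\, X_{t-\alpha} \right\}^2. \qquad (2)$$

Define $\mathbf{X} = (X_1, \ldots, X_n)'$, $\mathbf{U} = (U_1, \ldots,
U_n)'$, $\boldsymbol{\lambda} = (\lambda_0, \ldots, \lambda_{p(N+1)-1})'$,

$$\mathbf{B} = \begin{pmatrix} B_0(U_1) & B_1(U_1) & \cdots & B_N(U_1) \\ B_0(U_2) & B_1(U_2) & \cdots & B_N(U_2) \\ \vdots & \vdots & & \vdots \\ B_0(U_n) & B_1(U_n) & \cdots & B_N(U_n) \end{pmatrix},$$

and $\mathbf{Z} = (\mathbf{B} \circ \mathbf{X}_1, \mathbf{B} \circ
\mathbf{X}_2, \ldots, \mathbf{B} \circ \mathbf{X}_p)$, where $\circ$ denotes
the Hadamard product and $\mathbf{X}_\alpha$ is an $n \times (N + 1)$ matrix
in which each column is the vector of lagged values $X_{t-\alpha}$. In matrix
notation, the least squares estimates are

$$\hat{\boldsymbol{\lambda}} = (\mathbf{Z}'\mathbf{Z})^{-1} \mathbf{Z}'\mathbf{X},$$

and the pre-estimates are

$$\hat{m}_\alpha(\mathbf{U}) = \mathbf{B} \left(\hat{\lambda}_{(N+1)(\alpha-1)}, \ldots, \hat{\lambda}_{\alpha(N+1)-1}\right)', \qquad \alpha = 1, \ldots, p.$$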
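
The pre-estimation step can be illustrated with a short NumPy sketch. This is
only a sketch under stated assumptions, not the implementation used for the
results in this paper: the function name `spline_pre_estimate` is
hypothetical, the interval $[a, b]$ is taken to be the observed range of
$U_t$, and only time points with all required lags available are used.

```python
import numpy as np

def spline_pre_estimate(x, p, d, n_interior):
    """Constant-spline pre-estimates of m_1..m_p at each U_t = X_{t-d}.

    Hypothetical helper: [a, b] is assumed to be the observed range of U_t.
    """
    x = np.asarray(x, dtype=float)
    start = max(p, d)                       # first index with all lags available
    t = np.arange(start, x.size)
    u = x[t - d]
    a, b = u.min(), u.max()
    n_bins = n_interior + 1                 # N + 1 equally spaced intervals
    # Index J of the interval containing u; u == b is placed in the last one.
    idx = np.clip(np.floor((u - a) / (b - a) * n_bins).astype(int), 0, n_bins - 1)
    basis = np.zeros((t.size, n_bins))      # indicator basis matrix B
    basis[np.arange(t.size), idx] = 1.0
    # Z = (B o X_1, ..., B o X_p): each block scales B row-wise by one lag.
    z = np.hstack([basis * x[t - alpha][:, None] for alpha in range(1, p + 1)])
    lam, *_ = np.linalg.lstsq(z, x[t], rcond=None)
    coef = lam.reshape(p, n_bins)           # row alpha-1 holds lambdas of m_alpha
    m_hat = coef[:, idx].T                  # shape (len(t), p)
    return t, u, m_hat
```

Using `np.linalg.lstsq` rather than forming $(\mathbf{Z}'\mathbf{Z})^{-1}$
explicitly keeps the sketch usable when some intervals contain no
observations, in which case $\mathbf{Z}'\mathbf{Z}$ is singular.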
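We now define the “pseudo-responses” as

$$\hat{Y}_{\gamma,t} = X_t - \sum_{\alpha=1,\,\alpha \ne \gamma}^{p} \hat{m}_\alpha(U_t)\, X_{t-\alpha}, \qquad t = p + 1, \ldots, n.$$

Define the vector of pseudo-responses as $\hat{\mathbf{Y}}_\gamma =
(\hat{Y}_{\gamma,p+1}, \ldots, \hat{Y}_{\gamma,n})'$. The spline-backfitted
local linear (SBLL) estimate for the coefficient function $m_\gamma(u)$ is

$$\hat{m}_{SBLL,\gamma}(u) = (1, 0) \left(\mathbf{V}'\mathbf{W}\mathbf{V}\right)^{-1} \mathbf{V}'\mathbf{W}\hat{\mathbf{Y}}_\gamma, \qquad (3)$$

where

$$\mathbf{V} = \begin{pmatrix} X_{p+1-\gamma} & X_{p+1-\gamma}(U_{p+1} - u) \\ \vdots & \vdots \\ X_{n-\gamma} & X_{n-\gamma}(U_n - u) \end{pmatrix},$$

$$\mathbf{W} = \operatorname{diag}\{K_h(U_{p+1} - u), \ldots, K_h(U_n - u)\},$$

$$K_h(u) = \frac{15}{16h} \left\{1 - \left(\frac{u}{h}\right)^2\right\}^2 I\{|u/h| \le 1\},$$

$I\{A\}$ is an indicator variable equal to one if condition $A$ is true and
zero otherwise, and $h$ is a bandwidth selected by the rule of thumb
criterion of Fan and Gijbels (1996). Likewise, we define the oracle local
linear smoother as

$$\tilde{m}_{o,\gamma}(u) = (1, 0) \left(\mathbf{V}'\mathbf{W}\mathbf{V}\right)^{-1} \mathbf{V}'\mathbf{W}\mathbf{Y}_\gamma,$$

where $\mathbf{Y}_\gamma$ is the vector of oracle responses $Y_{\gamma,t}$.
Wang and Yang (2007) first proposed the method using a Nadaraya-Watson
smoother in the last step and then proposed using a local linear smoother.
We use only the local linear smoother in this paper.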
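
The local linear step in (3) amounts to a weighted least squares fit at each
evaluation point $u$. The sketch below is illustrative only and assumes the
quartic kernel; the function names `quartic_kernel` and
`local_linear_coefficient` are hypothetical, and the bandwidth `h` is passed
in directly rather than chosen by a rule of thumb.

```python
import numpy as np

def quartic_kernel(v):
    """Quartic kernel K(v) = (15/16)(1 - v^2)^2 on [-1, 1], zero elsewhere."""
    v = np.asarray(v, dtype=float)
    return np.where(np.abs(v) <= 1.0, 15.0 / 16.0 * (1.0 - v**2) ** 2, 0.0)

def local_linear_coefficient(u0, u, x_lag, y, h):
    """Local linear estimate of m(u0) in y_t = m(u_t) * x_lag_t + error.

    u, x_lag, y are equal-length arrays; y plays the role of the
    pseudo-responses and x_lag the role of X_{t-gamma}.
    """
    w = quartic_kernel((u - u0) / h) / h             # diagonal of W
    v = np.column_stack([x_lag, x_lag * (u - u0)])   # design matrix V
    vw = v * w[:, None]                              # W V, row-wise scaling
    beta = np.linalg.solve(vw.T @ v, vw.T @ y)       # (V'WV)^{-1} V'W y
    return beta[0]                                   # (1, 0) selects the level
```

Passing the pseudo-responses $\hat{Y}_{\gamma,t}$ as `y` gives the SBLL
estimate, while passing the oracle responses gives the oracle smoother; the
code path is otherwise identical.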


2.1. Asymptotic properties 

The following assumptions are necessary for the data generating process to
be geometrically ergodic and for the asymptotic properties of the SBLL
method. For the interval $[a, b]$ and functions $m$, let $C^{(2)}[a, b] =
\{m \mid m'' \in C[a, b]\}$ denote the space of second-order continuous
smooth functions and let

$$\mathrm{Lip}([a, b], C) = \{m \mid |m(u) - m(v)| \le C|u - v|, \ \forall u, v \in [a, b]\}$$

denote the class of Lipschitz continuous functions for any fixed constant
$C > 0$.

(A1) The coefficient function $m_\gamma(u) \in C^{(2)}[a, b]$ and
$m_\alpha(u) \in \mathrm{Lip}([a, b], C_\infty)$, $\alpha = 1, \ldots, p$,
$\alpha \ne \gamma$, for some constant $0 < C_\infty < \infty$.

(A2) For the process $\{\Theta_t = (U_t, X_t, X_{t-1}, \ldots, X_{t-p})\}$,
there exist positive constants $K_0$ and $\lambda_0$ such that $\alpha(k)
\le K_0 e^{-\lambda_0 k}$ holds for all $k$, with the $\alpha$-mixing
coefficients for $\{\Theta_t\}$ defined as

$$\alpha(k) = \sup_{B \in \sigma\{\Theta_s, s \le t\},\ C \in \sigma\{\Theta_s, s \ge t+k\}} |P(B \cap C) - P(B)P(C)|, \qquad k \ge 1.$$

(A3) The conditional variance function $\sigma^2(\mathbf{X}_t)$ is
measurable and bounded. The noise $\varepsilon_t$ satisfies
$E(\varepsilon_t \mid \mathbf{X}_t) = 0$, $E(\varepsilon_t^2 \mid
\mathbf{X}_t) = 1$ and $E(|\varepsilon_t|^{2+\delta} \mid \mathbf{X}_t) <
M_\delta$ for some $\delta > 1/2$ and a finite positive $M_\delta$.

(A4) The delay variable $U_t$ has a continuous probability density function
$f(u)$ that satisfies

$$0 < c_f \le \inf_{u \in [a, b]} f(u) \le \sup_{u \in [a, b]} f(u) \le C_f < \infty,$$

for some constants $c_f$ and $C_f$, and has continuous derivatives on
$[a, b]$.

(A5) The kernel function $K \in \mathrm{Lip}([-1, 1], C_K)$ for some
constant $C_K > 0$, and is bounded, nonnegative, symmetric and supported on
$[-1, 1]$. The bandwidth satisfies $c_h n^{-1/5} \le h \le C_h n^{-1/5}$ for
some positive constants $c_h$ and $C_h$.

(A 6 ) The number of interior knots is CTv^^'^^logn < N < logn for 

some positive constants cjq and Cat. 


Assumptions (A1)-(A5) are common in the nonparametric literature; see, for
example, Fan and Gijbels (1996). We apply the SBLL method to a TAR model in
Section 4, thus relaxing the smoothness assumption in Assumption (A1).
Theorem 1.1 in Chen and Tsay (1993) gives sufficient conditions for FCAR
models to be geometrically ergodic, which implies $\alpha$-mixing, thus
satisfying Assumption (A2). Assumption (A6) is used in the pre-estimate
stage, in which we undersmooth with the splines to reduce the bias. The
increase in variance resulting from the pre-estimate stage is reduced in the
local linear regression stage, where the bandwidth $h$ is of the order found
in Assumption (A5). When implementing the method, we impose an additional
constraint on $N$ such that


N = min 
This additional constraint ensures that the number of terms in the least
squares problem (2) is no greater than $n/2$.

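To simplify the notation, we denote

$$\mu_j = \int_{-\infty}^{\infty} x^j K(x)\,dx, \qquad \nu_j = \int_{-\infty}^{\infty} x^j K^2(x)\,dx,$$

$$\Omega(u) = E\left(\mathbf{X}_t \mathbf{X}_t' \mid U_t = u\right),$$

and

$$\Omega^*(u) = E\left(\mathbf{X}_t \mathbf{X}_t' \sigma^2(U_t, \mathbf{X}_t) \mid U_t = u\right).$$

Under assumptions (A1)-(A5), it is straightforward from the results of Cai
et al. (2000) to verify that, as $n \to \infty$,

$$\sqrt{nh}\left\{\tilde{m}_{o,\gamma}(u) - m_\gamma(u) - b_\gamma(u) h^2\right\} \to N\left(0, v_\gamma^2(u)\right),$$

where

$$b_\gamma(u) = \frac{\mu_2}{2}\, m_\gamma''(u), \qquad (4)$$

$$v_\gamma^2(u) = \frac{\nu_0}{f(u)}\, \mathbf{e}_{\gamma,p}'\, \Omega^{-1}(u)\, \Omega^*(u)\, \Omega^{-1}(u)\, \mathbf{e}_{\gamma,p}, \qquad (5)$$

and $\mathbf{e}_{\gamma,p}$ is a $p \times 1$ vector with 1 in the
$\gamma$th position. Furthermore, from the results of Liu and Yang (2010) it
can be shown that as $n \to \infty$, the oracle smoother satisfies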

sup |mo ,7 (u) — rriy (m)| = Op ( ] . 

u£[a-\-h^b—h] \\TlhJ 


The following theorem gives the asymptotic uniform magnitude of the
difference between $\hat{m}_{SBLL,\gamma}(u)$ and $\tilde{m}_{o,\gamma}(u)$.

Theorem 2.1. Under assumptions (A1)-(A6), as $n \to \infty$, the SBLL
estimator $\hat{m}_{SBLL,\gamma}(u)$ given in (3) satisfies

sup IrhsBLL,-/ (u) - mo,-t (n)| = Op . 

u£[a,b] 


Theorem 2.1 states that the distance $\hat{m}_{SBLL,\gamma}(u) -
\tilde{m}_{o,\gamma}(u)$ is of an order that is dominated by the asymptotic
size of $\tilde{m}_{o,\gamma}(u) - m_\gamma(u)$. This implies that
$\hat{m}_{SBLL,\gamma}(u)$ will have the same asymptotic distribution as
$\tilde{m}_{o,\gamma}(u)$, which results in the following theorem.











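Theorem 2.2. Under assumptions (A1)-(A6), as $n \to \infty$, with
$b_\gamma(u)$ and $v_\gamma^2(u)$ defined in (4) and (5),

$$\sqrt{nh}\left\{\hat{m}_{SBLL,\gamma}(u) - m_\gamma(u) - b_\gamma(u) h^2\right\} \to N\left(0, v_\gamma^2(u)\right).$$

The proofs of Theorems 2.1 and 2.2 follow from the proofs of Liu and Yang
(2010).

To compare the performance of $\hat{m}_{SBLL,\gamma}(u)$ to
$\tilde{m}_{o,\gamma}(u)$, we use the oracle efficiencies of Wang and Yang
(2007), which are defined as

$$\mathrm{eff}_\gamma = \left[ \frac{\sum_{t=1}^{n} \left\{\tilde{m}_{o,\gamma}(u_t) - m_\gamma(u_t)\right\}^2}{\sum_{t=1}^{n} \left\{\hat{m}_{SBLL,\gamma}(u_t) - m_\gamma(u_t)\right\}^2} \right]^{1/2}. \qquad (6)$$

By Theorems 2.1 and 2.2, $\mathrm{eff}_\gamma$ should approach 1 as $n \to
\infty$ for all $\gamma = 1, \ldots, p$. We demonstrate this result via
simulation in Section 4.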
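
The efficiency in (6) is a ratio of root sums of squared errors and can be
computed directly once the true, oracle, and SBLL values are available on a
common grid. The following is a minimal sketch; the function name
`oracle_efficiency` and its argument names are hypothetical.

```python
import numpy as np

def oracle_efficiency(m_true, m_oracle, m_sbll):
    """Empirical efficiency (6): oracle error relative to SBLL error.

    All three inputs are arrays of values at the same points u_1, ..., u_n.
    """
    m_true = np.asarray(m_true, dtype=float)
    num = np.sum((np.asarray(m_oracle, dtype=float) - m_true) ** 2)
    den = np.sum((np.asarray(m_sbll, dtype=float) - m_true) ** 2)
    return np.sqrt(num / den)
```

A value near 1 indicates that the SBLL estimate is about as accurate as the
infeasible oracle estimate on that sample.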


3. Forecasting Methods 


We now present the three forecasting methods discussed in Harvill and Ray
(2005) and adapt them to be used with the SBLL method. Assuming
$m_\alpha(U)$ is known and $U$ is exogenous, we want to find an estimator of
the conditional expectation

$$E\left[X_{n+M} \mid X_n, \ldots, X_{n-p}\right] = E\left[\sum_{\alpha=1}^{p} m_\alpha(U_{n+M})\, X_{n+M-\alpha} \,\Big|\, X_n, \ldots, X_{n-p}\right] = \sum_{\alpha=1}^{p} m_\alpha(U_{n+M})\, E\left[X_{n+M-\alpha} \mid X_n, \ldots, X_{n-p}\right]. \qquad (7)$$

The expectation in (7) is no longer a simple linear operation when $U_t =
X_{t-d}$ for some positive constant $d$. The three forecasting methods
described below each deal with this expectation in a different way.
























Naive predictor 

The naive approach simply ignores the fact that the expectation in (7) is
not a linear function of $X_{n+M-\alpha}$ and substitutes
$\hat{X}_{n+M-\alpha}$ into the forecast equation. We estimate the
coefficient function using only the within-sample series values. The naive
predictor is defined as

$$\hat{X}_{n+M} = \sum_{\alpha=1}^{p} \hat{m}_\alpha\left(\hat{X}_{n+M-d}\right) \hat{X}_{n+M-\alpha},$$

where $\hat{X}_t = X_t$ for $t \le n$. For the SBLL estimator,
$\hat{m}_\alpha(\cdot)$ is the value obtained by the general form of (3).
Thus, the spline pre-estimate is not recomputed for each value of
$\hat{X}_{n+M-d}$, but the local linear estimate is.
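
The naive recursion can be sketched as follows. The sketch assumes the
estimated coefficient functions are supplied as a list of Python callables
(`coef_funcs`, a hypothetical interface) so that it stays independent of how
$\hat{m}_\alpha(\cdot)$ was obtained.

```python
import numpy as np

def naive_forecast(x, coef_funcs, d, horizon):
    """Plug-in forecasts for steps 1..horizon.

    coef_funcs[alpha - 1] is a callable estimate of m_alpha, fitted once
    on the within-sample data and never updated.
    """
    path = list(np.asarray(x, dtype=float))
    for _ in range(horizon):
        u = path[-d]                      # X_{n+M-d}: observed or forecast
        nxt = sum(f(u) * path[-alpha]
                  for alpha, f in enumerate(coef_funcs, start=1))
        path.append(nxt)                  # forecast feeds later steps
    return np.array(path[len(x):])
```

For example, with a single constant coefficient of 0.5 and last observation
2.0, `naive_forecast([1.0, 2.0], [lambda u: 0.5], d=1, horizon=3)` produces
the geometric sequence 1.0, 0.5, 0.25.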


Bootstrap predictor 

The bootstrap predictor is like the naive predictor in that it estimates the
functional coefficients using only the within-sample values. However, we
bootstrap the within-sample residuals from the estimated model and find the
predicted value as

$$\hat{X}^{b}_{n+M} = \sum_{\alpha=1}^{p} \hat{m}_\alpha\left(\hat{X}^{b}_{n+M-d}\right) \hat{X}^{b}_{n+M-\alpha} + \varepsilon^{b},$$

where $\varepsilon^{b}$ is the bootstrapped residual. We obtain bootstrapped
forecasts for $b = 1, \ldots, B$, and use the average of these values as the
$M$-step ahead forecast. For the SBLL method, we estimate
$m_\alpha(\cdot)$ as with the naive predictor. An advantage of using the
bootstrap values is that the set of all values allows us to estimate the
predictive density of $X_{n+M}$. A disadvantage is that the estimated
functional coefficients may become unreliable when $\hat{X}_{n+M-d}$ is
outside or near the boundary of the range of the original $X_{t-d}$. This
disadvantage was first noted by Huang and Shen (2004) and reiterated by
Harvill and Ray (2005).
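
A bootstrap forecast of this kind can be sketched by adding a resampled
residual at every step of each simulated path. This is an illustration under
assumptions: the function name, the `coef_funcs` interface, and the default
number of bootstrap paths are hypothetical, and residuals are resampled with
replacement using NumPy's `Generator.choice`.

```python
import numpy as np

def bootstrap_forecast(x, coef_funcs, residuals, d, horizon, n_boot=200, seed=0):
    """Average of n_boot simulated paths, plus the paths themselves."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    paths = np.empty((n_boot, horizon))
    for b in range(n_boot):
        path = list(x)
        for _ in range(horizon):
            u = path[-d]
            mean = sum(f(u) * path[-alpha]
                       for alpha, f in enumerate(coef_funcs, start=1))
            path.append(mean + rng.choice(residuals))  # add a resampled residual
        paths[b] = path[len(x):]
    return paths.mean(axis=0), paths
```

The returned `paths` array holds one row per bootstrap replicate, so its
columns can be used to approximate the predictive density at each horizon.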


Multistage predictor 

Another way to handle the expectation in (7) is to incorporate the
information from $X_t$ encoded in the predicted response at time $n + j$,
$j = 1, \ldots, M - 1$. This is accomplished by updating the functional
coefficients at each step and obtaining the forecast by

$$\hat{X}_{n+M} = \sum_{\alpha=1}^{p} \hat{m}_\alpha\left(\hat{X}_{n+M-d}\right) \hat{X}_{n+M-\alpha},$$

where $\hat{X}_t = X_t$ for $t \le n$. The functional coefficient
$\hat{m}_\alpha(\cdot)$ is estimated by the SBLL method at each step. That
is, we include the predicted values $\hat{X}_t$, $t = n + 1, \ldots,
n + M - 1$, with the original values $X_t$, $t = 1, \ldots, n$, and then
re-estimate the functional coefficient with the SBLL method using the new
set of data. Clearly, this method is more computationally intensive than
the naive predictor and possibly the bootstrap predictor, depending on the
number of bootstraps taken.
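
The difference from the naive recursion is that the coefficient functions
are refitted before every step. A minimal sketch, assuming a user-supplied
callable `fit` (hypothetical) that maps the current series to a list of
estimated coefficient functions:

```python
def multistage_forecast(x, fit, d, horizon):
    """Forecast with coefficient functions re-estimated at every step.

    fit(series) must return a list of callables [m_1, ..., m_p]; in the
    setting of this paper it would wrap an SBLL fit.
    """
    path = [float(v) for v in x]
    for _ in range(horizon):
        coef_funcs = fit(path)            # refit on observed + forecast values
        u = path[-d]
        nxt = sum(f(u) * path[-alpha]
                  for alpha, f in enumerate(coef_funcs, start=1))
        path.append(nxt)
    return path[len(x):]
```

Because `fit` is called `horizon` times, the cost of this method scales with
the forecast horizon times the cost of a single model fit.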


4. Simulation 

In the following, we investigate the performance of the three forecasting
methods via simulation. We use three parametric models, described in more
detail in each of the three examples. An FCAR model is fit to data from the
three models using the method described in Section 2, and forecasts are
obtained using the methods explained in Section 3. The models in the first
two examples satisfy all assumptions provided in Section 2.1. However, the
model in the third example is a self-exciting threshold autoregressive
model, and so the continuity Assumption (A1) is not satisfied. For all three
models, we provide empirical efficiencies as defined in equation (6). We
also provide the root mean square prediction error (RMPE) for the three
forecasting methods. For each model, we ran 500 Monte Carlo iterations for
series lengths $n = 75, 150, 250$ and $500$, and orders $p = 4, 10$.

Example 4.1. We first consider the model

$$X_t = \sum_{\alpha=1}^{p} a_\alpha \sin(\omega \pi X_{t-d})\, X_{t-\alpha} + \sigma(\mathbf{X}_t)\,\varepsilon_t, \qquad \varepsilon_t \sim N(0, 1),$$

where

$$\sigma(\mathbf{X}_t) = 0.1\, \frac{\sqrt{p}}{2} \cdot \frac{5 - \exp\left(\sum_{\alpha=1}^{p} |X_{t-\alpha}|/p\right)}{5 + \exp\left(\sum_{\alpha=1}^{p} |X_{t-\alpha}|/p\right)}. \qquad (8)$$

In both cases, we set the delay to $d = 2$. The term $\sigma(\mathbf{X}_t)$
ensures the model is heteroscedastic with the variance roughly proportional
to $p$. For $p = 4$, we used the parameters

$$\mathbf{a} = (0.5, -0.5, 0.5, -0.5)'$$

and $\omega = 4.5$. For $p = 10$, we used

$$\mathbf{a} = (0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5)'$$

and $\omega = 1.5$. These choices for $\mathbf{a}$ make the bounds of the
coefficient functions $\pm|a_\alpha|$ so that the roots of the
characteristic polynomial

$$\lambda^p - a_1 \lambda^{p-1} - \cdots - a_p = 0$$

are inside the unit circle, thus ensuring the process is geometrically
ergodic.

Figure 1 about here.
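
For readers who wish to experiment, data from a model of this form can be
generated as in the sketch below. It is an assumption-laden illustration:
the function names, the burn-in length, the zero starting values, and the
form of the variance function (the $\sqrt{p}/2$ scaling and the sign in the
denominator) are our reading of the model and are not guaranteed to match
the simulation code behind the reported results.

```python
import numpy as np

def sigma(lags, sigma0=0.1):
    """Assumed variance function: sigma0 * sqrt(p)/2 * (5 - e) / (5 + e)."""
    p = lags.size
    e = np.exp(np.sum(np.abs(lags)) / p)
    return sigma0 * np.sqrt(p) / 2.0 * (5.0 - e) / (5.0 + e)

def simulate_sine_fcar(n, a, omega, d=2, burn_in=200, seed=0):
    """Simulate X_t = sum_a a_alpha sin(omega*pi*X_{t-d}) X_{t-alpha} + noise."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    p = a.size
    x = np.zeros(n + burn_in + p)           # zero start-up values (assumption)
    for t in range(p, x.size):
        lags = x[t - p:t][::-1]             # (X_{t-1}, ..., X_{t-p})
        coef = a * np.sin(omega * np.pi * x[t - d])
        x[t] = coef @ lags + sigma(lags) * rng.standard_normal()
    return x[-n:]                           # drop burn-in
```

Discarding a burn-in segment reduces the influence of the arbitrary starting
values on the retained series.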


For comparing the three forecasting methods, we calculate the root mean
square prediction error (RMPE),

$$\mathrm{RMPE}_M = \left[ \frac{1}{500} \sum_{i=1}^{500} \left(\hat{X}_{n+M,i} - X_{n+M,i}\right)^2 \right]^{1/2}, \qquad M = 1, \ldots, 10, \qquad (9)$$

where $\hat{X}_{n+M,i}$ is the forecast at time $n + M$ for iteration $i$,
and $X_{n+M,i}$ is the value at time $n + M$ of the $i$th simulated process.
The RMPE is calculated for each of the three forecasting methods. Figure 2
shows the $\mathrm{RMPE}_M$, $M = 1, \ldots, 10$, for Example 4.1 for three
series lengths. For $p = 4$, $n = 75$ (Figure 2(a)), we can see that the
multistage method has lower RMPE for $M = 1, \ldots, 7$, but has higher RMPE
than the naive method for $M = 8, 9, 10$. For $p = 10$, $n = 75$ (Figure
2(b)), the naive and multistage methods are much closer at the lower values
of $M$, with the naive method performing better for the larger values of
$M$. For $p = 4$, the larger series lengths show the naive method having
RMPE just as low as, if not lower than, the other two methods (Figures 2(c)
and 2(e)). For $p = 10$, the larger series lengths show the multistage
method having the lowest RMPE for some smaller values of $M$, with the naive
method being lower for the larger values of $M$ (Figures 2(d) and 2(f)). The
simulation results show that the naive method performs just as well as, if
not better than, both the bootstrap and multistage methods. An additional
advantage in using the naive method over the other two is that it is
computationally faster.
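
Given arrays of forecasts and realized values with one row per Monte Carlo
iteration and one column per horizon, the RMPE in (9) reduces to a
column-wise root mean square. The function name `rmpe` and the array layout
are assumptions made for this sketch.

```python
import numpy as np

def rmpe(forecasts, actuals):
    """Root mean square prediction error for each horizon M.

    forecasts, actuals: arrays of shape (iterations, horizon).
    """
    err = np.asarray(forecasts, dtype=float) - np.asarray(actuals, dtype=float)
    return np.sqrt(np.mean(err**2, axis=0))   # average over iterations
```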
Figure 2 about here.

Example 4.2. The next model we consider is the EXPAR model

$$X_t = \sum_{\alpha=1}^{p} \left\{a_\alpha + b_\alpha \exp\left(-\delta X_{t-d}^2\right)\right\} X_{t-\alpha} + \sigma(\mathbf{X}_t)\,\varepsilon_t, \qquad \varepsilon_t \sim N(0, 1),$$

where $\sigma(\mathbf{X}_t)$ is defined in (8). We use

$$\mathbf{a} = (0.3, -0.35, 0.1, -0.2)',$$

$$\mathbf{b} = (0.2, -0.15, 0.4, -0.3)',$$

and $\delta = 25$ for $p = 4$, and

$$\mathbf{a} = (0.3, -0.35, 0.1, -0.2, 0.35, -0.1, 0.2, -0.3, 0.25, -0.25)',$$

$$\mathbf{b} = (0.2, -0.15, 0.4, -0.3, 0.15, -0.4, 0.3, -0.2, 0.25, -0.25)',$$

and $\delta = 5$ for $p = 10$. For both cases, we set the delay variable to
$X_{t-2}$. As was the case in Example 4.1, the values of $\mathbf{a}$ and
$\mathbf{b}$ are determined so that the bounds of the coefficient functions
ensure the process is geometrically ergodic.

The estimated coefficient function using the SBLL method and the oracle
estimate for $m_\alpha(X_{t-2})$ with $p = 4, 10$ and $n = 75, 500$ are
shown in Figures 3(a)-3(d). The densities of the empirical efficiencies are
shown in Figures 3(e) and 3(f). For this example, the efficiencies are shown
for $\alpha = 4$ and $\alpha = 10$ for $p = 4$ and $p = 10$, respectively.
As we saw in Example 4.1, the modes of the densities tend to be closer to
one as $n$ increases.

Figure 3 about here.

The RMPEs for the three forecasting methods are shown in Figure 4. For this
example, we see the multistage method having the lowest RMPE for $p = 4$ for
most values of $M$ (Figures 4(a), 4(c) and 4(e)). Note that the differences
between the multistage and the naive methods are small, particularly for
$n = 500$ (Figure 4(e)). For $p = 10$, we see the naive method has the
smallest RMPE for most values of $M$, except for the series length $n = 500$
(Figure 4(f)). For $n = 500$, the multistage method tends to have lower
RMPE. Again, the differences between the methods are small.




























Figure 4 about here.

Example 4.3. For the last example, we relax the continuity Assumption (A1)
and apply the estimation and forecasting methods to the self-exciting
threshold autoregressive (SETAR) model

$$X_t = \sum_{\alpha=1}^{p} \phi_\alpha(X_{t-d})\, X_{t-\alpha} + \varepsilon_t, \qquad \varepsilon_t \sim N(0, 1),$$

where

$$\phi_\alpha(X_{t-d}) = \begin{cases} a_\alpha & \text{if } X_{t-d} \le r_\alpha, \\ b_\alpha & \text{if } X_{t-d} > r_\alpha. \end{cases}$$

For $p = 4$, we used

$$\mathbf{a} = (0.5, 0.2, 0.1, -0.4)',$$

$$\mathbf{b} = (0.4, -0.5, 0.5, -0.5)',$$

and

$$\mathbf{r} = (0, -0.1, -0.2, 0)'.$$

For $p = 10$, we used

$$\mathbf{a} = (0.5, 0.2, 0.1, -0.4, 0.4, -0.1, -0.2, -0.5, -0.25, 0.25)',$$

$$\mathbf{b} = (0.4, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.4, 0.5, -0.5)',$$

and

$$\mathbf{r} = (0, -0.1, -0.2, 0.1, 0.2, 0.3, -0.3, 0, 0.1, 0.2)'.$$

For both cases, we used the delay variable $X_{t-1}$. We chose the values
for $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{r}$ to ensure the process is
geometrically ergodic, as we did in Examples 4.1 and 4.2.


The estimated coefficient function using the SBLL method and the oracle
estimate for $m_\alpha(X_{t-1})$ with $p = 4, 10$ and $n = 75, 500$ are
shown in Figures 5(a)-5(d). The densities of the empirical efficiencies are
shown in Figures 5(e) and 5(f). For this example, the efficiencies are shown
for $\alpha = 3$ for both $p = 4$ and $p = 10$. The estimated coefficient
functions appear to be biased at the value of the regime $r_2 = -0.1$.
However, the modes of the efficiencies still tend to be closer to 1 as the
series lengths increase, indicating that relaxing Assumption (A1) does not
affect the asymptotic behavior of the SBLL method, or, at least, it affects
the behavior in the same manner as it affects the behavior of the oracle
estimator.

Figure 5 about here.

The RMPEs for the three forecasting methods are shown in Figure 6. From
Figures 6(a) and 6(b), we see that the multistage method tends to have the
highest RMPE for the larger values of $M$ when the series lengths are
$n = 75$. This inflated RMPE indicates that the SBLL method is not updating
the coefficient functions well in the multistage method. As the series
lengths increase (Figures 6(c)-6(f)), the multistage method becomes less
inflated in RMPE. For series length $n = 500$, the multistage method has the
lowest RMPE for most values of $M$, although the naive method is close, just
as it was for the previous two examples.

Figure 6 about here.


5. Application to Solar Irradiance 

We now apply the SBLL method to solar irradiance data taken from a sensor
located in Ashland, OR, as part of the University of Oregon Solar Radiation
Monitoring Laboratory. Many variables affect solar irradiance. However, one
variable that is more influential than others is the amount of cloud cover.
At a typical site, a day (or a period of time during a day) is classified as
“clear sky,” “partly cloudy,” or “overcast.” In the solar energy literature,
methodology used for fitting this type of time series includes ARIMA models
(Martin et al., 2010), regression analysis (Reikard, 2009), k-nearest
neighbors (Paoli et al., 2010), and Bayesian models (Paoli et al., 2010).
More recently, in Patrick et al. (2015), an FCAR model is used to fit solar
irradiance, and is shown to be superior to existing methods in the solar
energy literature.

The data in this paper contain measured irradiance in W/m² at five-minute
intervals throughout the day. Figure 7(a) contains a plot of the data (solid
line), with a clear sky model (dotted line) superimposed. In the measured
irradiance, a diurnal trend is clearly present. In accordance with the
methods described in Sections 2 and 3, we fit an FCAR model to data taken on
November 11, 2013 (a mostly cloudy day) using the SBLL method and forecast
ten time points beginning at 1:35 PM (all times are PST).

We begin by removing the diurnal trend in the measured irradiance. Figure
7(a) contains a plot of the measured irradiance, which is affected by the
cloud cover, and a theoretical clear sky model; that is, the expected
measured irradiance if there were no clouds present in the sky. Even though
the data we are using are from a mostly cloudy day, we can still see a
diurnal trend that must be removed. A clear sky model is used to remove this
trend. A discussion of clear sky irradiance models can be found in Reno et
al. (2012). For our application, we used the Ineichen clear sky model
(Ineichen and Perez, 2002) and calculated the clear sky index, which is
defined as the ratio of the measured irradiance to the clear sky model.
Figure 7(b) shows the clear sky index for the time interval starting at 8:00
AM ($t = 96$) and ending at 4:00 PM ($t = 192$).
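
The detrending step and the time indexing can be illustrated with a short
sketch. The function names are hypothetical, the clear sky values are
assumed to be supplied by an external clear sky model, and the guard against
very small clear sky values (for example at night) is an added safeguard
that is not described in the text.

```python
import numpy as np

def clear_sky_index(measured, clear_sky, min_clear_sky=1e-6):
    """Ratio of measured irradiance to clear sky irradiance.

    Entries where the clear sky value is not above min_clear_sky are
    returned as NaN (an added safeguard, not part of the definition).
    """
    measured = np.asarray(measured, dtype=float)
    clear_sky = np.asarray(clear_sky, dtype=float)
    index = np.full(measured.shape, np.nan)
    ok = clear_sky > min_clear_sky
    index[ok] = measured[ok] / clear_sky[ok]
    return index

def time_index(hour, minute):
    """Index t of a clock time when observations are five minutes apart."""
    return (60 * hour + minute) // 5
```

With five-minute sampling, `time_index(8, 0)` is 96, `time_index(16, 0)` is
192, and `time_index(13, 35)` is 163, matching the values of $t$ quoted in
this section.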


Figure 7 about here.

The fit of the FCAR model can be seen in Figure 7(c). We found that $p = 2$
and $d = 5$ gave the best fit. We also fit a linear AR model of order 4,
which we determined by minimizing the AIC. The MSEs of the fitted models
were 0.0009 for the FCAR model and 0.0020 for the linear AR model.

We forecast for $M = 1, \ldots, 10$ using the naive and multistage methods
and compared them to forecasts using the linear AR model. Figure 7(d) shows
the forecasts and the observed values starting at 1:35 PM ($t = 163$). The
root squared prediction errors (RSPE) for the three methods are given in
Table 1. For $M = 1, \ldots, 5$, the FCAR forecasting methods had lower
RSPEs, with the naive method having the lowest at $M = 1, 3, 4, 5$ and the
multistage method having the lowest at $M = 2$. At $M = 6, \ldots, 10$, the
linear AR model had the lowest RSPE. For the smaller values of $M$, these
results agree with the simulation results in that the naive method performs
just as well as, if not better than, the multistage method.

Table 1 about here.


6. Discussion 

We have adapted the SBLL method to FCAR models and shown that these
estimators are oracally efficient. We examined the performance of naive,
bootstrap, and multistage forecasting methods with a model estimated with
the SBLL method. By estimating the model in this way, we have shown through
simulation results that the bootstrap method did not perform as well as the
other two methods. We have also shown that the naive method performs just as
well as the multistage method and even outperforms it in some situations.
The main advantage to using the naive method is that it is much faster
computationally than the other two methods.

For a real world example, we showed the naive and multistage methods
performing better than a linear AR model when applied to solar irradiance
data for $M = 1, \ldots, 5$. The day we selected for our application was
mostly cloudy throughout the day. Future research will examine the fit of an
FCAR model to irradiance data when the cloud cover conditions change during
the day. This model will need to incorporate a covariate for the cloud
cover. In this paper, we have shown that the SBLL method is adequate for
fitting the model and forecasting in the absence of a covariate as long as
the amount of cloud cover is constant throughout the day.
Table 1: Root squared prediction error for M = 1,..., 10 of the naive,
multistage, and linear AR forecasts. An asterisk marks the smallest
prediction error for that value of M.

M            1       2       3       4       5       6       7       8       9       10
Naive        0.001*  0.017   0.013*  0.014*  0.190*  0.171   0.175   0.185   0.206   0.231
Multistage   0.001   0.014*  0.026   0.064   0.201   0.154   0.162   0.186   0.209   0.222
Linear AR    0.203   0.210   0.227   0.266   0.204   0.111*  0.080*  0.034*  0.006*  0.027*
Figure 1: Plots (a) - (d) are graphs of the true coefficient function (solid
line), the SBLL estimate (dashed line), and the oracle estimate (dotted
line) for Example 4.1: (a) p = 4, n = 75, (b) p = 10, n = 75, (c) p = 4,
n = 500, (d) p = 10, n = 500. Plots (e) and (f) contain the empirical
efficiency densities for Example 4.1 with series lengths n = 75 (thin solid
line), 150 (dashed line), 250 (dotted line), and 500 (thick solid line) for
p = 4 (e) and p = 10 (f).

Figure 2: Plots of the RMPE for Example 4.1. For each of the plots, the
solid line represents the naive forecast, the dashed line represents the
bootstrap forecast, and the dotted line represents the multistage forecast.
(a) p = 4, n = 75, (b) p = 10, n = 75, (c) p = 4, n = 250, (d) p = 10,
n = 250, (e) p = 4, n = 500, (f) p = 10, n = 500.
Figure 3: Plots (a) - (d) are graphs of the true coefficient function (solid
line), the SBLL estimate (dashed line), and the oracle estimate (dotted
line) for Example 4.2: (a) p = 4, n = 75, (b) p = 10, n = 75, (c) p = 4,
n = 500, (d) p = 10, n = 500. Plots (e) and (f) contain the empirical
efficiency densities for Example 4.2 with series lengths n = 75 (thin solid
line), 150 (dashed line), 250 (dotted line), and 500 (thick solid line) for
p = 4 (e) and p = 10 (f).

Figure 4: Plots of the RMPE for Example 4.2. For each of the plots, the
solid line represents the naive forecast, the dotted line represents the
bootstrap forecast, and the dashed line represents the multistage forecast.
(a) p = 4, n = 75, (b) p = 10, n = 75, (c) p = 4, n = 250, (d) p = 10,
n = 250, (e) p = 4, n = 500, (f) p = 10, n = 500.
Figure 5: Plots (a) - (d) are graphs of the true coefficient function (solid
line), the SBLL estimate (dashed line), and the oracle estimate (dotted
line) for Example 4.3: (a) p = 4, n = 75, (b) p = 10, n = 75, (c) p = 4,
n = 500, (d) p = 10, n = 500. Plots (e) and (f) contain the empirical
efficiency densities for Example 4.3 with series lengths n = 75 (thin solid
line), 150 (dashed line), 250 (dotted line), and 500 (thick solid line) for
p = 4 (e) and p = 10 (f).

Figure 6: Plots of the RMPE for Example 4.3. For each of the plots, the
solid line represents the naive forecast, the dotted line represents the
bootstrap forecast, and the dashed line represents the multistage forecast.
(a) p = 4, n = 75, (b) p = 10, n = 75, (c) p = 4, n = 250, (d) p = 10,
n = 250, (e) p = 4, n = 500, (f) p = 10, n = 500.
Figure 7: Plots of (a) measured irradiance (solid line) and clear sky model
(dotted line) for November 11, 2013, (b) the clear sky index transformation
between 8:00 AM and 4:00 PM, (c) the observed transformed data (solid line),
the fitted FCAR model using the SBK method (dashed line), and the fit of a
linear AR(4) model (dot-dash line), (d) observed transformed data (thin
solid line) and forecasts for M = 1,..., 10 starting at 1:35 PM using the
naive method (thick solid line), the multistage method (dashed line), and a
linear AR(4) model (dot-dash line).










